local trend
DTW+S: Shape-based Comparison of Time-series with Ordered Local Trend
Measuring distance or similarity between time-series data is a fundamental aspect of many applications including classification, clustering, and ensembling/alignment. Existing measures may fail to capture similarities among local trends (shapes) and may even produce misleading results. Our goal is to develop a measure that looks for similar trends occurring around similar times and is easily interpretable for researchers in applied domains. This is particularly useful for applications where time-series have a sequence of meaningful local trends that are ordered, such as in epidemics (a surge to an increase to a peak to a decrease). We propose a novel measure, DTW+S, which creates an interpretable "closeness-preserving" matrix representation of the time-series, where each column represents local trends, and then it applies Dynamic Time Warping to compute distances between these matrices. We present a theoretical analysis that supports the choice of this representation. We demonstrate the utility of DTW+S in several tasks. For the clustering of epidemic curves, we show that DTW+S is the only measure able to produce good clustering compared to the baselines. For ensemble building, we propose a combination of DTW+S and barycenter averaging that results in the best preservation of characteristics of the underlying trajectories. We also demonstrate that our approach results in better classification compared to Dynamic Time Warping for a class of datasets, particularly when local trends rather than scale play a decisive role.
A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling
Vakayil, Akhil, Joseph, Roshan
In this work, we propose a novel framework for large-scale Gaussian process (GP) modeling. Contrary to the global, and local approximations proposed in the literature to address the computational bottleneck with exact GP modeling, we employ a combined global-local approach in building the approximation. Our framework uses a subset-of-data approach where the subset is a union of a set of global points designed to capture the global trend in the data, and a set of local points specific to a given testing location to capture the local trend around the testing location. The correlation function is also modeled as a combination of a global, and a local kernel. The performance of our framework, which we refer to as TwinGP, is on par or better than the state-of-the-art GP modeling methods at a fraction of their computational cost.
Laplacian Convolutional Representation for Traffic Time Series Imputation
Chen, Xinyu, Cheng, Zhanhong, Saunier, Nicolas, Sun, Lijun
Spatiotemporal traffic data imputation is of great significance in intelligent transportation systems and data-driven decision-making processes. To make an accurate reconstruction from partially observed traffic data, we assert the importance of characterizing both global and local trends in traffic time series. In the literature, substantial prior works have demonstrated the effectiveness of utilizing low-rankness property of traffic data by matrix/tensor completion models. In this study, we first introduce a Laplacian kernel to temporal regularization for characterizing local trends in traffic time series, which can be formulated in the form of circular convolution. Then, we develop a low-rank Laplacian convolutional representation (LCR) model by putting the nuclear norm of a circulant matrix and the Laplacian temporal regularization together, which is proved to meet a unified framework that takes a fast Fourier transform (FFT) solution in a relatively low time complexity. Through extensive experiments on some traffic datasets, we demonstrate the superiority of LCR for imputing traffic time series of various time series behaviors (e.g., data noises and strong/weak periodicity). The proposed LCR model is an efficient and effective solution to large-scale traffic data imputation over the existing baseline models. Despite the LCR's application to time series data, the key modeling idea lies in bridging the low-rank models and the Laplacian regularization through FFT, which is also applicable to image inpainting. The adapted datasets and Python implementation are publicly available at https://github.com/xinychen/transdim.
Classification of Spontaneous Speech of Individuals with Dementia Based on Automatic Prosody Analysis Using Support Vector Machines (SVM)
Ossewaarde, Roelant (Rijksuniversiteit Groningen) | Jonkers, Roel (Rijksuniversiteit Groningen) | Jalvingh, Fedor (Rijksuniversiteit Groningen) | Bastiaanse, Roelien (Rijksuniversiteit Groningen)
Analysis of spontaneous speech is an important tool for clinical linguists to diagnose various dementia types that affect the language processing areas. Prosody is affected by some dementia types, most notably Parkinson's disease (PD, degradation of voice quality, unstable pitch), Alzheimer's disease (AD, monotonic pitch), and the non-fluent type of Primary Progressive Aphasia (PPA-NF, hesitant, non-fluent speech). Prosodic features can be computed efficiently by software. In this study, we evaluate the performance of a SVM classifier that is trained on prosodic features only. The limitation to only prosody yields baseline results that can be used in a later stage to evaluate the added effect of variables of (morpho) syntax. The goal is to distinguish different dementia types based on the recorded speech. Results show that the classifier can distinguish some dementia types (PPA-NF, AD), but not others (PD, PPA-SD).